Active Bias: Training a More Accurate Neural Network by Emphasizing High Variance Samples
Abstract
Self-paced learning and hard example mining re-weight training instances to improve learning accuracy. This paper presents two improved alternatives based on lightweight estimates of sample uncertainty in stochastic gradient descent (SGD): the variance in predicted probability of the correct class across iterations of minibatch SGD, and the proximity of the correct class probability to the decision threshold. Extensive experimental results on six datasets show that our methods reliably improve accuracy in various network architectures, including additional gains on top of other popular training techniques, such as residual learning, momentum, ADAM, batch normalization, dropout, and distillation.
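The two uncertainty estimates named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the function names, the smoothing constant `EPS`, the use of the standard deviation of the recorded probabilities as the variance signal, the use of `p * (1 - p)` as a closeness measure to a 0.5 decision threshold, and the normalization of weights to mean 1. It is not presented as the authors' exact method.

```python
from statistics import fmean, pvariance

EPS = 0.05  # hypothetical smoothing constant so that no sample gets zero weight


def variance_weights(history):
    """Weight samples by how much their correct-class probability fluctuated.

    history[i] is a non-empty list of the probabilities the model assigned to
    the correct class of sample i in earlier SGD iterations (assumed layout).
    """
    raw = [(pvariance(h) if len(h) > 1 else 0.0) ** 0.5 + EPS for h in history]
    mean = fmean(raw)
    return [r / mean for r in raw]  # normalize so the weights average to 1


def threshold_weights(history):
    """Weight samples by how close their average correct-class probability
    is to an assumed decision threshold of 0.5; p * (1 - p) peaks there."""
    raw = [fmean(h) * (1.0 - fmean(h)) + EPS for h in history]
    mean = fmean(raw)
    return [r / mean for r in raw]


def weighted_loss(losses, weights):
    """Minibatch loss with per-sample losses scaled by their weights."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Three samples: confidently correct, fluctuating, and stuck near the threshold.
history = [[0.9, 0.9, 0.9], [0.2, 0.8, 0.5], [0.5, 0.5, 0.5]]
vw = variance_weights(history)
tw = threshold_weights(history)
```

In this toy history the fluctuating sample receives the largest variance-based weight, while the threshold-based weight favors both samples whose average probability sits at 0.5 over the confidently classified one.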
Related resources
Active Bias: Training More Accurate Neural Networks by Emphasizing High Variance Samples
[1] Bengio, Yoshua, Louradour, Jérôme, Collobert, Ronan, and Weston, Jason. Curriculum learning. In ICML, 2009. [2] Kumar, M Pawan, Packer, Benjamin, and Koller, Daphne. Self-paced learning for latent variable models. In NIPS, 2010. [3] Shrivastava, Abhinav, Gupta, Abhinav, and Girshick, Ross. Training region-based object detectors with online hard example mining. In CVPR, 2016. [4] Avramova,...
Solving the Bias-Variance Problem during Network Training
The following document outlines a theoretical approach to constructing an optimisation function for use in neural network training which could be used to solve the bias-variance dilemma, and thereby achieve optimal generalisation. The idea is rooted in quantitative use of probability and results in a cost function which embodies a form of orthogonal regression. A brief comment regarding biologi...
Handwritten Character Recognition using Modified Gradient Descent Technique of Neural Networks and Representation of Conjugate Descent for Training Patterns
The purpose of this study is to analyze the performance of the backpropagation algorithm with changing training patterns and the second momentum term in feedforward neural networks. This analysis is conducted on 250 different words of three small letters from the English alphabet. These words are presented to two vertical segmentation programs which are designed in MATLAB and based on portions (1...
Evaluation of Ultimate Torsional Strength of Reinforcement Concrete Beams Using Finite Element Analysis and Artificial Neural Network
Due to lack of theory of elasticity, estimation of ultimate torsional strength of reinforcement concrete beams is a difficult task. Therefore, the finite element methods could be applied for determination of strength of concrete beams. Furthermore, for complicated, highly nonlinear and ambiguous status, artificial neural networks are appropriate tools for prediction of behavior of such states. ...
Comparing Two Methods of Neural Networks to Evaluate Dead Oil Viscosity
Reservoir characterization and asset management require comprehensive information about formation fluids. In fact, it is not possible to find accurate solutions to many petroleum engineering problems without having accurate pressure-volume-temperature (PVT) data. Traditionally, fluid information has been obtained by capturing samples and then by measuring the PVT properties in a laboratory. In ...